Abstract: This paper addresses a weighted sum rate (WSR) 
maximization problem for downlink OFDMA aided by a decode- 
and-forward (DF) relay under a total power constraint. A novel 
subcarrier-pair based opportunistic DF relaying protocol is pro- 
posed. Specifically, user message bits are transmitted in two time 
slots. A subcarrier in the first slot can be paired with a subcarrier 
in the second slot for the DF relay-aided transmission to a user. In 
particular, the source and the relay can transmit simultaneously 
to implement beamforming at the subcarrier in the second slot. 
Each unpaired subcarrier in either the first or second slot is 
used for the source's direct transmission to a user. A benchmark 
protocol, same as the proposed one except that the transmit 
beamforming is not used for the relay-aided transmission, is also 
considered. For each protocol, a polynomial-complexity algorithm 
is developed to find at least an approximately optimum resource 
allocation (RA), by using continuous relaxation, the dual method, 
and Hungarian algorithm. Instrumental to the algorithm design 
is an elegant definition of optimization variables, motivated by 
the idea of regarding the unpaired subcarriers as virtual subcarrier 
pairs in the direct transmission mode. The effectiveness of the RA 
algorithm and the impact of relay position and total power on the 
protocols' performance are illustrated by numerical experiments. 
It is shown that for each protocol, it is more likely to pair 
subcarriers for relay-aided transmission when the total power 
is low and the relay lies in the middle between the source and 
user region. The proposed protocol always leads to a maximum 
WSR equal to or greater than that for the benchmark one, and 
the performance gain of using the proposed one is significant 
especially when the relay is in close proximity to the source 
and the total power is low. Theoretical analysis is presented to 
interpret these observations. 

Index Terms: Resource allocation, decode and forward, transmit
beamforming, subcarrier pairing, orthogonal frequency division
multiple access, convex optimization.
I. Introduction 

Orthogonal frequency division multiple access (OFDMA) 
has been widely recognized as one of the dominant wireless 
technologies for high data-rate transmission. One of the main 
reasons behind this fact is that spectral efficiency of the 
OFDM(A) systems can be improved significantly by proper 
resource allocation (RA) when transmitter channel state information
(CSI) is available [1]-[3]. The incorporation of decode-and-forward
(DF) and amplify-and-forward (AF) relaying into
OFDM(A) systems through subcarrier-pair based protocols
and associated RA have lately been under intensive investigation
[4]-[29]. This class of protocols shares the following
features. User message bits are transmitted during two consecutive
equal-duration time slots. In the first slot, the source
broadcasts OFDM symbols, and so does the relay in the second
slot. The source might also emit OFDM symbols during the
second slot, as will be elaborated later. A subcarrier in the
first slot can be paired with a subcarrier in the second slot for
transmitting message bits with DF/AF relaying, referred to as
the relay-aided transmission mode hereafter.

In this paper, we focus on RA for downlink OFDMA with 
subcarrier-pair based DF relaying (there also exist works on 
RA for OFDMA systems using bidirectional relaying [4]). The 
subcarrier-pair based AF relaying has been studied in [5]-[8].
Note that the subcarrier-by-subcarrier based pairing may not 
be sufficient for DF relaying, since the information from a set 
of subcarriers in the first time slot can be decoded and re- 
encoded jointly and then forwarded through a different set of 
subcarriers in the second time slot H, lfP2l . Nevertheless, the 
subcarrier-pair based DF relaying has attracted much research 
interest due to simplicity or practical reasons [9]-[29].

When the source-to-destination (S-D) link is unavailable 
(i.e., the destination lies outside the source's radio coverage), 
RA problems for OFDM systems using subcarrier-pair based 
DF protocols have been addressed in [9]-[12]. In these works,
every subcarrier in the first time slot is paired with a subcarrier
in the second time slot for the relay-aided transmission, as
illustrated in Fig. 1a. To maximize sum rate under a total power
constraint, ordered subcarrier pairing has been proven to be the 
optimum, i.e., the strongest source-to-relay subcarrier should 
be paired with the strongest relay-to-destination subcarrier, and 
so on. 
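
As an illustration of ordered subcarrier pairing, the following minimal sketch sorts the two sets of channel gains and pairs them rank by rank. The function name `ordered_pairing` and the gain values are hypothetical and serve only as an example; they are not taken from the cited works.

```python
def ordered_pairing(gains_sr, gains_rd):
    """Pair the i-th strongest source-to-relay subcarrier with the
    i-th strongest relay-to-destination subcarrier.

    gains_sr[k]: source-to-relay gain of subcarrier k (first slot)
    gains_rd[l]: relay-to-destination gain of subcarrier l (second slot)
    Returns a list of (k, l) index pairs.
    """
    first = sorted(range(len(gains_sr)), key=lambda k: gains_sr[k], reverse=True)
    second = sorted(range(len(gains_rd)), key=lambda l: gains_rd[l], reverse=True)
    return list(zip(first, second))


# Hypothetical gains for three subcarriers.
pairs = ordered_pairing([0.2, 1.5, 0.7], [0.9, 0.1, 2.0])
print(pairs)
```

Here subcarrier 1 (the strongest in the first slot) is paired with subcarrier 2 (the strongest in the second slot), and so on down the ranking.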

The works in [13]-[29] have considered the case where
the S-D link is available. When only the relay emits OFDM
symbols in the second time slot, opportunistic relaying (sometimes
termed selection relaying) was studied in [13]-[20].
Specifically, a subcarrier in the first time slot can either be
paired with a subcarrier in the second slot for the relay-aided
transmission, or used directly for the S-D transmission without
the relay's assistance, referred to as the direct transmission
mode hereafter. It is very important to note that when some
subcarriers in the first slot are used in the direct transmission
mode, some subcarriers in the second slot will not be used,
as illustrated in Fig. 1b, which leads to a waste of precious
spectrum resource.

Fig. 1. Illustration of the subcarrier-pair based DF relaying protocols for
OFDM(A)-based systems, where every arrow indicates that the two associated
subcarriers are paired for the relay-aided transmission.
(a) When the S-D link is unavailable [9]-[12].
(b) When the S-D link is available but the source does not transmit
in the second slot [13]-[20].
(c) When the S-D link is available and the source transmits in the
second slot [21]-[29].
Legend: relay-aided mode; direct mode; no source/relay transmission.

To address the above issue, improved protocols which allow
the source to emit OFDM symbols in the second slot were
proposed and studied in [21]-[29]. The improved protocols
are the same as those considered in [13]-[18], except that the
source can also make direct S-D transmission at every unpaired
subcarrier in the second slot, as illustrated in Fig. 1c. Note
that the improved protocols do not really improve the way that
DF relaying is implemented over a subcarrier pair, but rather
let the source utilize the unpaired subcarriers in the second slot
for direct transmission to avoid the waste of spectrum resource.
In [24], [27], [29], the subcarrier pairing and power allocation
are jointly optimized for point-to-point OFDM systems. As
for OFDMA systems, RA problems considering the joint
optimization of power allocation, subcarrier assignment to users
and selection of multiple relays for transmit beamforming in
the second slot are addressed in [25], [26]. In these works,
a priori and CSI-independent subcarrier pairing is considered,
i.e., a subcarrier in the first slot is always paired with the
same subcarrier in the second slot if the relay-aided mode is
used. The optimization of subcarrier pairing and assignment to
users is addressed in [28] with a graph based approach. It is a
complicated RA problem to jointly optimize subcarrier pairing
and mode selection with power allocation and subcarrier
assignment to users.

Compared with the above existing works, this paper makes 
the following contributions: 

• A novel subcarrier-pair based opportunistic DF protocol
is proposed for downlink OFDMA aided by a DF relay.
This protocol further improves on those previously
studied in the literature [21]-[29], by allowing the
source and the relay to implement transmit beamforming
at a subcarrier in the second time slot for the relay-aided
transmission. Note that the protocols studied in
[25], [26] considered the selection of multiple DF relays
(excluding the source) for transmit beamforming in the
second slot, while the proposed protocol considers the
joint source-relay transmit beamforming. A benchmark
protocol, which is the same as the proposed one except
for the relay-aided transmission mode, is also considered.
Note that the proposed protocol truly improves the
implementation of DF relaying over a subcarrier pair
with transmit beamforming, which is not the case for the
benchmark protocol.

• The weighted sum rate (WSR) maximization RA problem is
addressed for both the proposed and benchmark protocols
under a total power constraint for the whole system. First,
it is shown that the proposed protocol leads to a maximum
WSR not smaller than that for the benchmark one. Then,
an algorithm is developed for each protocol to find at least
an approximately optimum RA with a WSR very close
to the maximum WSR. Instrumental to the elegance of
the RA algorithm is a definition of appropriate indicator
variables, making it possible to cast a subproblem related
to the joint optimization of transmission-mode selection,
subcarrier pairing and assignment to users into a standard
assignment problem that can be solved efficiently
by the Hungarian algorithm.
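
To make the notion of a standard assignment problem concrete, the sketch below solves a tiny instance: given a K-by-K profit matrix, it picks one column per row (a permutation) with the largest total profit. It uses exhaustive enumeration as a stand-in for the Hungarian algorithm, which returns the same optimum in polynomial time; the matrix entries and the function name `best_assignment` are hypothetical.

```python
from itertools import permutations


def best_assignment(profit):
    """Exhaustively solve a small assignment problem (maximization).

    profit[k][l] is the profit of matching first-slot subcarrier k with
    second-slot subcarrier l. Brute force is only viable for small K;
    it stands in for the Hungarian algorithm in this sketch.
    """
    K = len(profit)
    best_value, best_perm = float("-inf"), None
    for perm in permutations(range(K)):
        value = sum(profit[k][perm[k]] for k in range(K))
        if value > best_value:
            best_value, best_perm = value, perm
    return best_value, best_perm


# Hypothetical 3x3 profit matrix.
profit = [[4, 1, 3],
          [2, 0, 5],
          [3, 2, 2]]
value, perm = best_assignment(profit)
print(value, perm)
```

In the RA setting, each matrix entry would hold the best achievable contribution of matching subcarrier k with subcarrier l, maximized over the transmission mode and the user assignment.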

The rest of this paper is organized as follows. In the next
section, the system and transmission protocols are described.
Theoretical analysis is made to compare the maximum
WSRs of the two protocols in Section III. After that, the RA
algorithm is developed in Section IV. Numerical experiments
are shown to illustrate the effectiveness of the RA algorithm
and study the impact of relay position and total power on the
protocols' performance in Section V. Finally, some conclusions
are drawn.

Notations: A letter in bold, e.g. $\mathbf{x}$, represents a set.
$C(x) = \frac{1}{2}\log_2(1 + x)$.

II. Protocols and WSR maximization problem 

A. The transmission system and protocols 

Consider the downlink OFDMA transmission from a source 
to U users (user u = 1, . . . , U) aided by a DF relay. The 
source, relay and every user are each equipped with a single 
antenna, and the channel between every two of them is 
frequency selective. The source and the relay are synchronized 
so that they can simultaneously emit OFDM symbols using K 
subcarriers and with sufficiently long cyclic prefix to eliminate 
inter-symbol interference. 

The novel transmission protocol is half-duplex, i.e., user 
message bits are transmitted in two consecutive equal-duration 
time slots, during which all channels are assumed to keep 
unchanged. During the first slot, only the source broadcasts 
N OFDM symbols. Both the relay and all users receive these 
symbols. After proper processing explained later, the source 
and relay simultaneously broadcast N OFDM symbols, and 
the users receive them during the second slot. 

Due to the OFDMA, each subcarrier is dedicated to trans- 
mitting a single user's message exclusively. A subcarrier in 
the first slot can be paired with a subcarrier in the second slot 
for the relay-aided mode transmission to a user. Each unpaired 
subcarrier in either the first or second slot is used by the source 
for the direct mode transmission to a user. 

To simplify description, we use subcarriers $k$ and $l$ to denote
the $k$th and $l$th subcarriers used during the first and second
slots, respectively ($k, l = 1, \cdots, K$). We define the source
transmission powers for subcarrier $k$ in the first slot and
subcarrier $l$ in the second slot as $P_{s,k,1}$ and $P_{s,l,2}$, respectively.
The relay transmission power for subcarrier $l$ is $P_{r,l,2}$. The
complex amplitude gains at subcarrier $k$ for the source-to-relay,
source-to-$u$ and relay-to-$u$ channels are $h_{sr,k}$, $h_{su,k}$ and
$h_{ru,k}$, respectively. The two transmission modes for the novel
protocol are elaborated as follows:

1) The relay-aided transmission mode: Suppose subcarrier
$k$ is paired with subcarrier $l$ for the relay-aided mode
transmission to user $u$. In such a case, we refer to the two subcarriers
collectively as the subcarrier pair $(k, l)$. A block of message
bits is first encoded into a codeword of complex symbols
$\{\theta(n) \mid n = 1, \cdots, N\}$ with $E(|\theta(n)|^2) = 1, \forall n$. In the first
slot, the source broadcasts the codeword over subcarrier $k$
as illustrated in Figure 2a. At the relay and user $u$, the $n$th
baseband signals received through subcarrier $k$ are

$$y_{r,k}(n) = \sqrt{P_{s,k,1}}\, h_{sr,k}\, \theta(n) + z_{r,k}(n), \quad n = 1, \cdots, N, \qquad (1)$$

and

$$y_{u,k,1}(n) = \sqrt{P_{s,k,1}}\, h_{su,k}\, \theta(n) + z_{u,k,1}(n), \quad n = 1, \cdots, N, \qquad (2)$$

respectively, where $z_{r,k}(n)$ and $z_{u,k,1}(n)$ are both additive
white Gaussian noise (AWGN) with power $\sigma^2$. The signal-to-noise
ratio (SNR) at the relay is $P_{s,k,1} G_{sr,k}$, where
$G_{sr,k} = |h_{sr,k}|^2 / \sigma^2$. At the end of the first time slot, the relay decodes
the message bits from $\{y_{r,k}(n) \mid n = 1, \cdots, N\}$ and then
reencodes those bits into the same codeword as the source
did.

In the second time slot, the source and relay broadcast
the codewords $\{\theta(n) e^{-j\angle h_{su,l}} \mid \forall n\}$ and $\{\theta(n) e^{-j\angle h_{ru,l}} \mid \forall n\}$
through subcarrier $l$, respectively, where $\angle h_{su,l}$ and $\angle h_{ru,l}$
represent the phase of $h_{su,l}$ and $h_{ru,l}$, respectively. This means
that the source and relay implement transmit beamforming to
emit the codeword through subcarrier $l$ as illustrated in Figure
2b. Note that the source and relay need to know the phase
of $h_{su,l}$ and $h_{ru,l}$, respectively. At user $u$, the $n$th baseband
signal received through subcarrier $l$ is

$$y_{u,l,2}(n) = \left(\sqrt{P_{s,l,2}}\,|h_{su,l}| + \sqrt{P_{r,l,2}}\,|h_{ru,l}|\right)\theta(n) + z_{u,l,2}(n), \qquad (3)$$

where $z_{u,l,2}(n)$ is the AWGN with power $\sigma^2$.

Fig. 2. The relay-aided transmission mode over the subcarrier pair $(k, l)$ to
user $u$. (a) Transmission over subcarrier $k$ in the first slot.
(b) Transmission over subcarrier $l$ in the second slot.

Finally, user $u$ decodes the message bits from all signals
received during the two slots. These signals can be grouped
into $N$ vectors, the $n$th of which is

$$\mathbf{y}(n) = \begin{bmatrix} y_{u,k,1}(n) \\ y_{u,l,2}(n) \end{bmatrix}
= \begin{bmatrix} \sqrt{P_{s,k,1}}\, h_{su,k} \\ \sqrt{P_{s,l,2}}\,|h_{su,l}| + \sqrt{P_{r,l,2}}\,|h_{ru,l}| \end{bmatrix} \theta(n) + \mathbf{z}(n), \qquad (4)$$

where $\mathbf{z}(n) = [z_{u,k,1}(n), z_{u,l,2}(n)]^T$. Note that the
transmission in effect makes $N$ uses of a discrete memoryless
single-input-two-output channel specified by (4), with the
$n$th input and output being $\theta(n)$ and $\mathbf{y}(n)$, respectively. To
achieve the maximum reliable transmission rate, maximum
ratio combining should be used [30], i.e., user $u$ first turns
every $\mathbf{y}(n)$ into a decision variable

$$c(n) = \left(\sqrt{P_{s,k,1}}\, h_{su,k}\right)^{*} y_{u,k,1}(n)
+ \left(\sqrt{P_{s,l,2}}\,|h_{su,l}| + \sqrt{P_{r,l,2}}\,|h_{ru,l}|\right)^{*} y_{u,l,2}(n), \qquad (5)$$

and then decodes the message from $\{c(n) \mid \forall n\}$. It can readily
be derived that the SNR for this decoding is

$$\gamma_{klu}(P_{s,k,1}, P_{s,l,2}, P_{r,l,2}) = G_{su,k} P_{s,k,1}
+ \left(\sqrt{G_{su,l} P_{s,l,2}} + \sqrt{G_{ru,l} P_{r,l,2}}\right)^2, \qquad (6)$$

where $G_{su,k} = |h_{su,k}|^2 / \sigma^2$ and $G_{ru,l} = |h_{ru,l}|^2 / \sigma^2$.

To ensure both the relay and user $u$ can reliably decode
the message bits, the maximum number of message
bits that can be transmitted is $2N C(G_{sr,k} P_{s,k,1})$
and $2N C(\gamma_{klu}(P_{s,k,1}, P_{s,l,2}, P_{r,l,2}))$, respectively. This means
that the maximum transmission rate over the subcarrier
pair $(k, l)$ in the relay-aided mode to user $u$ is equal
to $C(\min\{G_{sr,k} P_{s,k,1}, \gamma_{klu}(P_{s,k,1}, P_{s,l,2}, P_{r,l,2})\})$ bits/OFDM-symbol
(bpos)$^1$.
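
The rate expression for the relay-aided mode can be sketched numerically as follows. This is a minimal illustration that assumes $C(x) = \frac{1}{2}\log_2(1+x)$ (the factor 1/2 reflects the 2N OFDM symbols used over the two slots); the function names, argument names and the gain and power values are hypothetical.

```python
import math


def C(x):
    """Rate in bits per OFDM symbol, assuming C(x) = 0.5 * log2(1 + x)."""
    return 0.5 * math.log2(1.0 + x)


def snr_relay_aided(G_suk, G_sul, G_rul, P_sk1, P_sl2, P_rl2):
    """SNR at user u after maximum ratio combining, following (6)."""
    beamformed = (math.sqrt(G_sul * P_sl2) + math.sqrt(G_rul * P_rl2)) ** 2
    return G_suk * P_sk1 + beamformed


def rate_relay_aided(G_srk, G_suk, G_sul, G_rul, P_sk1, P_sl2, P_rl2):
    """Rate over the pair (k, l): both the relay and the user must decode,
    so the smaller of the two SNRs limits the rate."""
    gamma = snr_relay_aided(G_suk, G_sul, G_rul, P_sk1, P_sl2, P_rl2)
    return C(min(G_srk * P_sk1, gamma))


# Hypothetical normalized gains and unit powers.
gamma = snr_relay_aided(G_suk=1.0, G_sul=1.0, G_rul=4.0,
                        P_sk1=1.0, P_sl2=1.0, P_rl2=1.0)
rate = rate_relay_aided(G_srk=4.0, G_suk=1.0, G_sul=1.0, G_rul=4.0,
                        P_sk1=1.0, P_sl2=1.0, P_rl2=1.0)
print(gamma, rate)
```

In this example the user-side SNR exceeds the relay-side SNR, so the source-to-relay link is the bottleneck of the pair.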

2) The direct transmission mode: Suppose subcarrier $k$
(respectively, subcarrier $l$) is unpaired with any subcarrier in
the second (respectively, first) slot, and is used for direct mode
transmission to user $u$. The source first encodes message bits
into a codeword of $N$ symbols, which are then broadcast
through subcarrier $k$ (respectively, subcarrier $l$; in such a case,
the relay keeps silent at subcarrier $l$ in the second slot, i.e.,
$P_{r,l,2} = 0$). User $u$ decodes the message bits from the signals
received through subcarrier $k$ (respectively, subcarrier $l$). The
maximum rate through subcarrier $k$ (respectively, subcarrier $l$)
in the direct transmission mode is $C(P_{s,k,1} G_{su,k})$ (respectively,
$C(P_{s,l,2} G_{su,l})$) bpos.

$^1$ Recall that $2N$ OFDM symbols are used in total during the two time slots.

A benchmark protocol is also considered. This protocol
is the same as the novel protocol except for the relay-aided
transmission mode. Specifically, the relay-aided mode is the
same as that widely studied in the literature [13]-[18], [21]-[27],
i.e., the source does not transmit at subcarrier $l$ during the
second slot, if subcarriers $k$ and $l$ are paired for the relay-aided
transmission to user $u$. In such a case, the maximum rate for
the relay-aided transmission over that subcarrier pair to user
$u$ is equal to $C(\min\{G_{sr,k} P_{s,k,1}, G_{su,k} P_{s,k,1} + G_{ru,l} P_{r,l,2}\})$
bpos. It is important to note that the benchmark protocol is a
special case of the novel protocol, since it is equivalent to the
novel protocol with the constraint that $P_{s,l,2} = 0$ if subcarrier
$l$ is paired with a subcarrier in the first slot for the relay-aided
mode transmission.
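
The special-case relationship can be checked numerically with a short sketch. It again assumes $C(x) = \frac{1}{2}\log_2(1+x)$; the function names and the gain and power values are hypothetical and chosen only for illustration.

```python
import math


def C(x):
    """Rate in bits per OFDM symbol, assuming C(x) = 0.5 * log2(1 + x)."""
    return 0.5 * math.log2(1.0 + x)


def rate_proposed(G_srk, G_suk, G_sul, G_rul, P_sk1, P_sl2, P_rl2):
    """Relay-aided rate with joint source-relay beamforming in slot 2."""
    gamma = G_suk * P_sk1 + (math.sqrt(G_sul * P_sl2)
                             + math.sqrt(G_rul * P_rl2)) ** 2
    return C(min(G_srk * P_sk1, gamma))


def rate_benchmark(G_srk, G_suk, G_rul, P_sk1, P_rl2):
    """Relay-aided rate when the source is silent on subcarrier l in slot 2."""
    return C(min(G_srk * P_sk1, G_suk * P_sk1 + G_rul * P_rl2))


def rate_direct(G, P):
    """Direct-mode rate over a single unpaired subcarrier."""
    return C(G * P)


# Hypothetical gains: G_sr,k = 5, G_su,k = 1, G_su,l = 2, G_ru,l = 3.
same = rate_proposed(5.0, 1.0, 2.0, 3.0, P_sk1=0.6, P_sl2=0.0, P_rl2=0.4)
bench = rate_benchmark(5.0, 1.0, 3.0, P_sk1=0.6, P_rl2=0.4)
# Same total power, but part of the second-slot power moved to the source.
shared = rate_proposed(5.0, 1.0, 2.0, 3.0, P_sk1=0.6, P_sl2=0.2, P_rl2=0.2)
print(same, bench, shared)
```

With $P_{s,l,2} = 0$ the two rates coincide, while for these particular values sharing the second-slot power between the source and the relay gives a higher rate at the same total power.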

B. The WSR maximization problem 

We assume there exists a central controller which knows
precisely the CSI $\{G_{sr,k}, G_{su,k}, G_{ru,k} \mid \forall k\}$. Before the data
transmission, the controller needs to find the optimum subcarrier
and power assignment, i.e., which subcarriers should be
paired for the relay-aided mode and which should be in the
direct mode, how these subcarriers should be assigned to the
users, as well as the source/relay power allocation to maximize
the WSR of all users for the adopted transmission protocol
(which can be either the novel or benchmark protocol), when
the total power consumption is not higher than a prescribed
value $P_t$. Then, the controller can inform the source and the
relay about the optimum subcarrier and power assignment to
be adopted for data transmission.

III. Theoretical analysis 

It can be shown that the proposed protocol leads to a 
maximum WSR greater than or equal to that for the bench- 
mark protocol. To this end, suppose the optimum subcarrier 
assignment and power allocation has been found for the 
benchmark protocol. By using the proposed protocol with the 
same subcarrier assignment and power allocation, the same 
WSR can be achieved. Obviously, the maximum WSR for the 
proposed protocol is greater than or equal to that WSR, namely 
the maximum WSR for the benchmark protocol. 

In Section III-A, we assume subcarriers $k$ and $l$ are paired
for the relay-aided mode transmission to user $u$, and a sum
power $P$ is used for this pair. We focus on computing the
maximum rate and optimum power allocation of this pair for
both protocols. Using these results, theoretical analysis will
be made in Section III-B to show when the maximum WSR
for the proposed protocol is strictly greater than that for the
benchmark one, and the RA algorithm will be developed in
Section IV. Moreover, this analysis plays an important role
in interpreting the numerical experiments shown in Section V,
which illustrate the impact of the relay's position on the benefit of
using the proposed protocol.



A. Rate maximization for the pair in the relay-aided mode 

1) Analysis for the proposed protocol: To facilitate derivation,
define $\Delta_{u,k} = G_{sr,k} - G_{su,k}$ and $G_{u,l} = G_{su,l} + G_{ru,l}$.
To maximize the rate, the optimum $P_{s,k,1}$, $P_{s,l,2}$ and $P_{r,l,2}$ are
the optimum solution for

$$\begin{aligned}
\max_{P_{s,k,1}, P_{s,l,2}, P_{r,l,2}} \quad & \min\{G_{sr,k} P_{s,k,1},\ \gamma_{klu}(P_{s,k,1}, P_{s,l,2}, P_{r,l,2})\} \\
\text{s.t.} \quad & P_{s,k,1} + P_{s,l,2} + P_{r,l,2} = P, \\
& P_{s,k,1} \ge 0,\ P_{s,l,2} \ge 0,\ P_{r,l,2} \ge 0.
\end{aligned} \qquad (7)$$

By using the Cauchy-Schwarz inequality, it can be shown
that

$$\gamma_{klu}(P_{s,k,1}, P_{s,l,2}, P_{r,l,2}) \le G_{su,k} P_{s,k,1} + G_{u,l} P_2, \qquad (8)$$

where $P_2 = P_{s,l,2} + P_{r,l,2}$ and the inequality is tight when
$P_{s,l,2} = \frac{G_{su,l}}{G_{u,l}} P_2$ and $P_{r,l,2} = \frac{G_{ru,l}}{G_{u,l}} P_2$. Now, the optimum
solution for (7) can be found by first solving

$$\begin{aligned}
\max_{P_{s,k,1}, P_2} \quad & \min\{G_{sr,k} P_{s,k,1},\ G_{su,k} P_{s,k,1} + G_{u,l} P_2\} \\
\text{s.t.} \quad & P_{s,k,1} + P_2 = P,\ P_{s,k,1} \ge 0,\ P_2 \ge 0
\end{aligned} \qquad (9)$$

for the optimum $P_{s,k,1}$ and $P_2$, and then using that $P_2$
to compute the optimum $P_{s,l,2}$ and $P_{r,l,2}$ according to the
formulas that tighten the inequality (8). Problem (9) can be
solved intuitively as follows. First, the two lines

$$\mathcal{L}_0 = \{(x, y_0(x)) \mid x \in [0, P],\ y_0(x) = G_{sr,k}\, x\}$$

$$\mathcal{L}_1 = \{(x, y_1(x)) \mid x \in [0, P],\ y_1(x) = G_{su,k}\, x + G_{u,l}(P - x)\}$$

can be plotted over the two-dimensional plane of coordinates
$(x, y)$ in Fig. 3. It can be seen that three different cases are
possible, each corresponding to a specific orientation of the
two lines. The coordinates of points A, B, C and D in the
figure are shown in Table I. The optimum $P_{s,k,1}$ and objective
value for (9) (which are also for (7)) are equal to the x and y
coordinates of the points A, B and D for the three cases in
Fig. 3, respectively. From this fact, it can easily be seen that
the optimum $P_{s,k,1}$, $P_{s,l,2}$ and $P_{r,l,2}$ for (7) are

$$P_{s,k,1} = \begin{cases}
\frac{G_{u,l}}{\Delta_{u,k} + G_{u,l}} P & \text{if } \min\{G_{sr,k}, G_{u,l}\} > G_{su,k}, \\
P & \text{if } \min\{G_{sr,k}, G_{u,l}\} \le G_{su,k},
\end{cases}$$

$$P_{s,l,2} = \begin{cases}
\frac{G_{su,l}}{G_{u,l}} \cdot \frac{\Delta_{u,k}}{\Delta_{u,k} + G_{u,l}} P & \text{if } \min\{G_{sr,k}, G_{u,l}\} > G_{su,k}, \\
0 & \text{if } \min\{G_{sr,k}, G_{u,l}\} \le G_{su,k},
\end{cases}$$

and

$$P_{r,l,2} = \begin{cases}
\frac{G_{ru,l}}{G_{u,l}} \cdot \frac{\Delta_{u,k}}{\Delta_{u,k} + G_{u,l}} P & \text{if } \min\{G_{sr,k}, G_{u,l}\} > G_{su,k}, \\
0 & \text{if } \min\{G_{sr,k}, G_{u,l}\} \le G_{su,k}.
\end{cases}$$

The maximum rate associated with the above optimum
solution is equal to $C(G^{n}_{klu} P)$ with

$$G^{n}_{klu} = \begin{cases}
\frac{G_{sr,k}\, G_{u,l}}{\Delta_{u,k} + G_{u,l}} & \text{if } \min\{G_{sr,k}, G_{u,l}\} > G_{su,k}, \\
\min\{G_{sr,k}, G_{su,k}\} & \text{if } \min\{G_{sr,k}, G_{u,l}\} \le G_{su,k}.
\end{cases} \qquad (10)$$
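
The closed-form solution can be cross-checked against a coarse grid search over the power split, as in the following sketch. The function names, the grid resolution and the numerical gains are hypothetical; the grid search is only a sanity check and not part of the analysis.

```python
import math


def optimum_proposed(G_srk, G_suk, G_sul, G_rul, P):
    """Closed-form optimum of problem (7).

    Returns (P_sk1, P_sl2, P_rl2, G_eff), where G_eff * P is the maximum
    of min{relay SNR, user SNR} for sum power P.
    """
    delta = G_srk - G_suk
    G_ul = G_sul + G_rul
    if min(G_srk, G_ul) > G_suk:
        P_sk1 = G_ul / (delta + G_ul) * P
        P2 = P - P_sk1  # total second-slot power
        return (P_sk1, G_sul / G_ul * P2, G_rul / G_ul * P2,
                G_srk * G_ul / (delta + G_ul))
    return P, 0.0, 0.0, min(G_srk, G_suk)


def grid_search(G_srk, G_suk, G_sul, G_rul, P, steps=200):
    """Brute-force maximum of the objective of (7) on a regular grid."""
    best = 0.0
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            P_sk1 = P * i / steps
            P_sl2 = P * j / steps
            P_rl2 = P * (steps - i - j) / steps
            gamma = G_suk * P_sk1 + (math.sqrt(G_sul * P_sl2)
                                     + math.sqrt(G_rul * P_rl2)) ** 2
            best = max(best, min(G_srk * P_sk1, gamma))
    return best


# Hypothetical gains: G_sr,k = 4, G_su,k = 1, G_su,l = 1, G_ru,l = 3.
P_sk1, P_sl2, P_rl2, G_eff = optimum_proposed(4.0, 1.0, 1.0, 3.0, P=1.0)
grid_best = grid_search(4.0, 1.0, 1.0, 3.0, P=1.0)
print(P_sk1, P_sl2, P_rl2, G_eff, grid_best)
```

For these values the closed form gives the split (4/7, 3/28, 9/28) with an effective gain of 16/7, and the grid search approaches this value from below.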
Fig. 3. Illustration of the two lines $\mathcal{L}_0$ and $\mathcal{L}_1$ in three different cases:
(a) $G_{sr,k} \le G_{su,k}$; (b) $G_{sr,k} > G_{su,k} \ge G_{u,l}$; (c) $\min\{G_{sr,k}, G_{u,l}\} > G_{su,k}$.

TABLE I
Coordinates of A, B, C and D in Figure 3

      A              B              C              D
x     $P$            $P$            $0$            $\frac{G_{u,l}}{\Delta_{u,k} + G_{u,l}} P$
y     $G_{sr,k} P$   $G_{su,k} P$   $G_{u,l} P$    $\frac{G_{sr,k} G_{u,l}}{\Delta_{u,k} + G_{u,l}} P$


in Fig. 4. The coordinates of points A and D are the same
as given in Table I and those of points $C_1$, $C_2$, $D_1$ and $D_2$
are given in Table II. Most interestingly, $G^{n}_{klu} P$ and $G^{b}_{klu} P$ are
equal to the y-coordinate of $D_1$ and $D_2$, respectively, and $C_1$
is above $C_2$ since $G_{u,l} > G_{ru,l}$. In particular, the following
points should be noted:

• $G^{n}_{klu} > G^{b}_{klu}$ holds because $D_1$ is above $D_2$.

• When $G_{sr,k}$ increases (meaning that point A is elevated),
$G^{n}_{klu} - G^{b}_{klu}$ increases (since the difference of the
y-coordinates of points $D_1$ and $D_2$ is increased).

• When $G_{ru,l}$ increases (meaning that points $C_1$ and $C_2$ are
both elevated), $G^{n}_{klu} - G^{b}_{klu}$ reduces, because

$$G^{n}_{klu} - G^{b}_{klu} = \frac{\Delta_{u,k}\, G_{su,l}\, G_{sr,k}}{(\Delta_{u,k} + G_{su,l} + G_{ru,l})(\Delta_{u,k} + G_{ru,l})}$$

is a decreasing function of $G_{ru,l}$.




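2) Analysis for the benchmark protocol: In this case,
$P_{s,l,2} = 0$ and the optimum $P_{s,k,1}$ and $P_{r,l,2}$ for maximizing
the rate are the optimum solution for

$$\begin{aligned}
\max_{P_{s,k,1}, P_{r,l,2}} \quad & \min\{P_{s,k,1} G_{sr,k},\ P_{s,k,1} G_{su,k} + P_{r,l,2} G_{ru,l}\} \\
\text{s.t.} \quad & P_{s,k,1} + P_{r,l,2} = P,\ P_{s,k,1} \ge 0,\ P_{r,l,2} \ge 0,
\end{aligned} \qquad (11)$$

which can also be solved by the intuitive method as described
above. It can be shown that the optimum $P_{s,k,1}$ and $P_{r,l,2}$ are

$$P_{s,k,1} = \begin{cases}
\frac{G_{ru,l}}{\Delta_{u,k} + G_{ru,l}} P & \text{if } \min\{G_{sr,k}, G_{ru,l}\} > G_{su,k}, \\
P & \text{if } \min\{G_{sr,k}, G_{ru,l}\} \le G_{su,k},
\end{cases}$$

$$P_{r,l,2} = \begin{cases}
\frac{\Delta_{u,k}}{\Delta_{u,k} + G_{ru,l}} P & \text{if } \min\{G_{sr,k}, G_{ru,l}\} > G_{su,k}, \\
0 & \text{if } \min\{G_{sr,k}, G_{ru,l}\} \le G_{su,k},
\end{cases}$$

and the maximum rate associated with the above optimum
solution is equal to $C(G^{b}_{klu} P)$ with

$$G^{b}_{klu} = \begin{cases}
\frac{G_{sr,k}\, G_{ru,l}}{\Delta_{u,k} + G_{ru,l}} & \text{if } \min\{G_{sr,k}, G_{ru,l}\} > G_{su,k}, \\
\min\{G_{sr,k}, G_{su,k}\} & \text{if } \min\{G_{sr,k}, G_{ru,l}\} \le G_{su,k}.
\end{cases} \qquad (12)$$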

B. Comparison of the two protocols 

To compare the maximum WSR for the two protocols,
it is necessary to first compare $G^{n}_{klu}$ and $G^{b}_{klu}$. When
$G_{su,k} \ge \min\{G_{sr,k}, G_{ru,l}\}$, $G^{b}_{klu} = \min\{G_{sr,k}, G_{su,k}\}$. If
$\min\{G_{sr,k}, G_{u,l}\} \le G_{su,k}$, $G^{n}_{klu} = \min\{G_{sr,k}, G_{su,k}\} = G^{b}_{klu}$
follows. If $\min\{G_{sr,k}, G_{u,l}\} > G_{su,k}$, it can be seen
that $G^{n}_{klu} P$ and $G_{su,k} P$ correspond to the y-coordinates of
points D and B in Figure 3c, respectively, and therefore
$G^{n}_{klu} > G_{su,k}$ since D is higher than B. This means that
$G^{n}_{klu} \ge G^{b}_{klu}$ always holds when $G_{su,k} \ge \min\{G_{sr,k}, G_{ru,l}\}$.

When $\min\{G_{sr,k}, G_{ru,l}\} > G_{su,k}$, $G^{n}_{klu}$ and $G^{b}_{klu}$ can be
compared through a visualization method as follows. Specifically,
we plot the lines $\mathcal{L}_0$, $\mathcal{L}_1$ and

$$\mathcal{L}_2 = \{(x, y_2(x)) \mid x \in [0, P],\ y_2(x) = G_{su,k}\, x + G_{ru,l}(P - x)\} \qquad (13)$$



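Fig. 4. Illustration of $G^{n}_{klu}$ and $G^{b}_{klu}$ when $\min\{G_{sr,k}, G_{ru,l}\} > G_{su,k}$.
Plotted lines: $y_0(x) = G_{sr,k}\, x$; $y_1(x) = G_{su,k}\, x + G_{u,l}(P - x)$;
$y_2(x) = G_{su,k}\, x + G_{ru,l}(P - x)$.

TABLE II
Coordinates of $C_1$, $C_2$, $D_1$ and $D_2$ in Fig. 4

      $C_1$          $C_2$          $D_1$                                                 $D_2$
x     $0$            $0$            $\frac{G_{u,l}}{\Delta_{u,k} + G_{u,l}} P$            $\frac{G_{ru,l}}{\Delta_{u,k} + G_{ru,l}} P$
y     $G_{u,l} P$    $G_{ru,l} P$   $\frac{G_{sr,k} G_{u,l}}{\Delta_{u,k} + G_{u,l}} P$   $\frac{G_{sr,k} G_{ru,l}}{\Delta_{u,k} + G_{ru,l}} P$
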

The above analysis indicates that $G^{n}_{klu} \ge G^{b}_{klu}$ always
holds, and $G^{n}_{klu} - G^{b}_{klu}$ increases when either $G_{sr,k}$ increases
or $G_{ru,l}$ reduces, if $\min\{G_{sr,k}, G_{ru,l}\} > G_{su,k}$.
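
These monotonicity properties can be verified numerically with the closed forms (10) and (12), as in the sketch below. The function names and gain values are hypothetical; `gap_formula` implements the closed-form difference, which applies when $\min\{G_{sr,k}, G_{ru,l}\} > G_{su,k}$.

```python
def G_proposed(G_srk, G_suk, G_sul, G_rul):
    """Effective gain of the proposed protocol, following (10)."""
    delta, G_ul = G_srk - G_suk, G_sul + G_rul
    if min(G_srk, G_ul) > G_suk:
        return G_srk * G_ul / (delta + G_ul)
    return min(G_srk, G_suk)


def G_benchmark(G_srk, G_suk, G_rul):
    """Effective gain of the benchmark protocol, following (12)."""
    delta = G_srk - G_suk
    if min(G_srk, G_rul) > G_suk:
        return G_srk * G_rul / (delta + G_rul)
    return min(G_srk, G_suk)


def gap_formula(G_srk, G_suk, G_sul, G_rul):
    """Closed-form difference, valid when min{G_sr,k, G_ru,l} > G_su,k."""
    delta = G_srk - G_suk
    return delta * G_sul * G_srk / ((delta + G_sul + G_rul) * (delta + G_rul))


def gap(G_srk, G_suk, G_sul, G_rul):
    return G_proposed(G_srk, G_suk, G_sul, G_rul) - G_benchmark(G_srk, G_suk, G_rul)


# Hypothetical baseline: G_sr,k = 4, G_su,k = 1, G_su,l = 1, G_ru,l = 3.
base = gap(4.0, 1.0, 1.0, 3.0)
stronger_sr = gap(8.0, 1.0, 1.0, 3.0)   # larger source-to-relay gain
stronger_ru = gap(4.0, 1.0, 1.0, 6.0)   # larger relay-to-user gain
print(base, stronger_sr, stronger_ru)
```

For these values the gap widens when the source-to-relay gain grows and narrows when the relay-to-user gain grows, in line with the analysis.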

Using the above results, we now show that the proposed
protocol leads to a strictly higher maximum WSR than the
benchmark protocol, if there exist at least two subcarriers
that must be paired for the relay-aided transmission for the
benchmark protocol to maximize the WSR. To this end,
collect the subcarrier pairs that must be used by the benchmark
protocol to maximize the WSR in the set $\Phi$, and
$\forall (k,l) \in \Phi$ denote $u_{kl}$ and $P_{u_{kl}}$ as the user which should
use this subcarrier pair and the sum power that should be
assigned to this pair. The rate contributed by this pair must
be equal to $C(G^{b}_{klu_{kl}} P_{u_{kl}})$ as shown earlier. In such a case,
$\min\{G_{sr,k}, G_{ru_{kl},l}\} > G_{su_{kl},k}$ must be satisfied, because
otherwise simply using subcarriers $k$ and $l$ separately in the
direct mode can lead to a higher sum rate. Suppose the
proposed protocol is now used with a suboptimum RA which
adopts the same subcarrier assignment as the optimum RA
for the benchmark protocol. For every subcarrier in the direct
mode, this RA uses the same source power allocation as the
optimum value for the benchmark protocol, and $\forall (k,l) \in \Phi$
this RA uses $P_{u_{kl}}$ as the sum power for the subcarrier pair
$(k, l)$. The maximum rate for this subcarrier pair is equal to
$C(G^{n}_{klu_{kl}} P_{u_{kl}})$. Since $\min\{G_{sr,k}, G_{ru_{kl},l}\} > G_{su_{kl},k}$ holds,
$G^{n}_{klu_{kl}} > G^{b}_{klu_{kl}}$ follows from earlier analysis, and therefore
$C(G^{n}_{klu_{kl}} P_{u_{kl}}) > C(G^{b}_{klu_{kl}} P_{u_{kl}})$ must hold. This means that
the proposed protocol has a strictly higher maximum WSR
than the benchmark protocol.

IV. RA Algorithm design 

A. Formulation of the RA problem 

To formulate the WSR maximization problem for the
adopted protocol (which can be either the proposed or benchmark
protocol), we define

$$G_{klu} = \begin{cases}
G^{n}_{klu} & \text{if the proposed protocol is adopted,} \\
G^{b}_{klu} & \text{if the benchmark protocol is adopted.}
\end{cases}$$

For any configuration of transmission-mode selection,
subcarrier pairing and assignment to users used by the adopted
protocol, suppose $m$ subcarrier pairs are assigned to the relay-aided
transmission, then it is always possible to one-to-one
associate the unpaired subcarriers in the two slots to form
$K - m$ virtual subcarrier pairs, each allocated to possibly two
different users for direct transmission separately. Motivated by
this observation, the RA problem is formulated by defining the
following variables:

• $t_{klu} \in \{0, 1\}$ for any combination of $k, l, u$. $t_{klu} = 1$
indicates that subcarrier $k$ is paired with subcarrier $l$ for
the relay-aided transmission to user $u$.

• $P_{klu} \ge 0$ for any combination of $k, l, u$. When $t_{klu} = 1$,
$P_{klu}$ is used as the total power for the subcarrier pair
$(k, l)$.

• $t_{klab} \in \{0, 1\}$ for any combination of $k, l$ and $a, b \in \mathbf{U}$.
$t_{klab} = 1$ indicates that subcarrier $k$ is assigned in the
direct transmission mode to user $a$ during the first slot,
and so is subcarrier $l$ to user $b$ during the second slot.

• $\alpha_{klab} \ge 0$ and $\beta_{klab} \ge 0$ for any combination of $k, l, a, b$.
When $t_{klab} = 1$, $P_{s,k,1}$ and $P_{s,l,2}$ take the value of $\alpha_{klab}$
and $\beta_{klab}$, respectively.

Let us collect all indicator and power variables in the sets $\mathbf{I}$
and $\mathbf{P}$, respectively, and define $\mathbf{S} = \{\mathbf{I}, \mathbf{P}\}$. Every feasible RA
scheme can be described by an $\mathbf{S}$ satisfying simultaneously

$$t_{klu}, t_{klab} \in \{0, 1\}, \ \forall k, l, u, a, b, \qquad (14)$$

$$\sum_{l} \Big( \sum_{u} t_{klu} + \sum_{a,b} t_{klab} \Big) = 1, \ \forall k, \qquad (15)$$

$$\sum_{k} \Big( \sum_{u} t_{klu} + \sum_{a,b} t_{klab} \Big) = 1, \ \forall l, \qquad (16)$$

$$\sum_{k,l,u,a,b} \big( t_{klu} P_{klu} + t_{klab} (\alpha_{klab} + \beta_{klab}) \big) \le P_t, \qquad (17)$$

$$P_{klu} \ge 0,\ \alpha_{klab} \ge 0,\ \beta_{klab} \ge 0, \ \forall k, l, u, a, b, \qquad (18)$$

$$P_{klu} = 0 \ \text{if } t_{klu} = 0, \ \forall k, l, u, a, b, \qquad (19)$$

$$\alpha_{klab} = 0,\ \beta_{klab} = 0 \ \text{if } t_{klab} = 0, \ \forall k, l, u, a, b, \qquad (20)$$

where (15) and (16) guarantee the OFDMA, i.e., every subcarrier
is used exclusively for the transmission of message bits to
a unique user. (17) and (18) ensure the total power constraint is
satisfied. The constraints (19) and (20) are added to guarantee
that every $\mathbf{S}$ is one-to-one mapped to a new variable for the
change of variable (COV) proposed later to solve the RA
problem.

Note that an $\mathbf{S}$ satisfying (14)-(20) indicates a unique
feasible RA scheme for the adopted protocol. Viewed from
the other way around, any feasible RA scheme can also be
described by an $\mathbf{S}$ satisfying those constraints. Interestingly,
the same feasible RA scheme might be described by multiple
different $\mathbf{S}$ all satisfying these constraints. For instance,
consider the scenario where there is only a single user $u$, and
the RA scheme requiring messages to be transmitted in the
direct mode, respectively, through subcarriers $k_1$ and $k_2$ during
the first slot and subcarriers $l_1$ and $l_2$ during the second slot.
This RA scheme can be described by using either an $\mathbf{S}$ with
$t_{k_1 l_1 uu} = t_{k_2 l_2 uu} = 1$ and $t_{k_1 l_2 uu} = t_{k_2 l_1 uu} = 0$, or another
$\mathbf{S}'$ with $t_{k_1 l_2 uu} = t_{k_2 l_1 uu} = 1$ and $t_{k_1 l_1 uu} = t_{k_2 l_2 uu} = 0$.
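
The single-user example above can be reproduced with a small sketch that stores the nonzero indicator variables in dictionaries and checks the OFDMA constraints (15)-(16). The data layout (dictionaries keyed by index tuples, zero-based indices) and the function names are assumptions made for illustration only.

```python
def satisfies_ofdma(K, t_relay, t_direct):
    """Check (15)-(16): every first-slot subcarrier k and every second-slot
    subcarrier l belongs to exactly one real or virtual pair.

    t_relay maps (k, l, u) to 0/1 and t_direct maps (k, l, a, b) to 0/1;
    missing keys are treated as 0.
    """
    row = [0] * K  # usage count of each first-slot subcarrier
    col = [0] * K  # usage count of each second-slot subcarrier
    for (k, l, _u), t in t_relay.items():
        row[k] += t
        col[l] += t
    for (k, l, _a, _b), t in t_direct.items():
        row[k] += t
        col[l] += t
    return all(r == 1 for r in row) and all(c == 1 for c in col)


def direct_usage(t_direct):
    """Physical meaning of the direct-mode indicators: the user served on
    each first-slot and each second-slot subcarrier."""
    first = {k: a for (k, l, a, b), t in t_direct.items() if t == 1}
    second = {l: b for (k, l, a, b), t in t_direct.items() if t == 1}
    return first, second


# Two subcarriers per slot, a single user with index 0, direct mode only.
S = {(0, 0, 0, 0): 1, (1, 1, 0, 0): 1}        # virtual pairs (k1,l1), (k2,l2)
S_prime = {(0, 1, 0, 0): 1, (1, 0, 0, 0): 1}  # virtual pairs (k1,l2), (k2,l1)

print(satisfies_ofdma(2, {}, S), satisfies_ofdma(2, {}, S_prime))
print(direct_usage(S) == direct_usage(S_prime))
```

Both descriptions satisfy the constraints and map to the same physical use of the subcarriers, which is the non-uniqueness noted above.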

Given a feasible $\mathbf{S}$, the maximum WSR for the adopted
protocol is

$$f(\mathbf{S}) = \sum_{k,l,u,a,b} \Big( t_{klu}\, w_u\, C(G_{klu} P_{klu})
+ t_{klab} \big( w_a\, C(G_{sa,k}\, \alpha_{klab}) + w_b\, C(G_{sb,l}\, \beta_{klab}) \big) \Big), \qquad (21)$$

where $w_u > 0$ is the weight prescribed for user $u$. The WSR
maximization problem is to solve

$$\text{(P1)} \quad \max_{\mathbf{S}} \ f(\mathbf{S}) \quad \text{s.t.} \ (14)-(20)$$

for a globally optimum $\mathbf{S}$. We will develop an algorithm in
the following subsections to find it, after which the optimum
subcarrier assignment and source/relay power allocation can
be computed according to the analysis in Section III-A.

B. The idea behind the RA algorithm design 

Note that (P1) is a nonconvex program consisting of both
continuous and binary variables, thus in general its duality
gap is not zero. Similar nonconvex optimization problems
for multicarrier systems exist in the literature [31], [32]. A
possible approach to tackle them is to show their duality gaps
approach zero when a sufficiently large number of subcarriers
is used. This justifies the use of the dual method to find an
asymptotically optimum solution.

Here, we use a continuous-relaxation based approach to find
at least an approximately optimum $\mathbf{S}$ for (P1). Similar methods
were also used in [33], [34] to compute asymptotic capacity
regions. Specifically, all indicator variables are first relaxed to
be continuous within $[0, 1]$, after which we get a new problem

$$\begin{aligned}
\text{(P2)} \quad \max_{\mathbf{S}} \quad & f(\mathbf{S}) \\
\text{s.t.} \quad & t_{klu}, t_{klab} \in [0, 1], \ \forall k, l, u, a, b, \qquad (22) \\
& (15)-(20),
\end{aligned}$$

as a relaxation of (P1). Define the feasible set of (P2) as $F_S$.
Obviously, the feasible set of (P1) is a subset of $F_S$.

Then, we make the COV from $\mathbf{P}$ to $\tilde{\mathbf{P}} =
\{\tilde{P}_{klu}, \tilde{\alpha}_{klab}, \tilde{\beta}_{klab} \mid \forall k, l, u, a, b\}$, where every $\tilde{P}_{klu}$, $\tilde{\alpha}_{klab}$ and
$\tilde{\beta}_{klab}$ satisfy, respectively,

$$\tilde{P}_{klu} = t_{klu} P_{klu}, \quad \tilde{\alpha}_{klab} = t_{klab}\, \alpha_{klab}, \quad \tilde{\beta}_{klab} = t_{klab}\, \beta_{klab}. \qquad (23)$$

After the COV, we collect all variables into X = {I, P}. It 
is important to note that an S £ Fs is one-to-one mapped to 
an X G Fx, where Fx contains the set of all X's satisfying 
J, (fl5t-(fl6l). as well as 



k,l,u,a,b 



Pklu + OLklab + Pklab ) < Pt, 



(24) 



Pklu > 0, auab > 0, hiab > 0, V k, I, u, a, b, (25) 
Pklu = if t klu = 0, V k,l,u,a, b, (26) 

Uklab = 0, fiklab = if tki ab = 0, V k, /, u, a, 6. (27) 

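As a function of $\mathbf{X} \in F_X$, the WSR can be rewritten as

$g(\mathbf{X}) = f(\mathbf{S}(\mathbf{X})) = \sum_{k,l,u,a,b} \Big( w_u\,\phi(t_{klu}, \tilde{P}_{klu}, G_{klu}) + w_a\,\phi(t_{klab}, \tilde{\alpha}_{klab}, G_{sa,k}) + w_b\,\phi(t_{klab}, \tilde{\beta}_{klab}, G_{sb,l}) \Big),$  (28)

where $\mathbf{S}(\mathbf{X})$ represents the S corresponding to the $\mathbf{X} \in F_X$
and

$\phi(t, x, G) = \begin{cases} t\,C\!\left(\frac{Gx}{t}\right) & \text{if } t > 0, \\ 0 & \text{if } t = 0. \end{cases}$  (29)

It can readily be shown that $\phi(t, x, G)$ with fixed $G$ is a
continuous and concave function of $t \ge 0$ and $x$, because it is
a perspective function of $C(Gx)$, which is concave in $x$ (see
pages 89-90 of [35] for more details). Therefore, $g(\mathbf{X})$ is a
concave function of $\mathbf{X} \in F_X$.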
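As a numerical illustration of the concavity claim, the following minimal sketch evaluates the perspective function of (29) and checks midpoint concavity on random points. It assumes $C(x) = \frac{1}{2}\log_2(1+x)$, which is defined outside this section, and the helper names `phi` and `midpoint_gap` are hypothetical.

```python
import math


def C(x):
    """Rate function; assumed here to be 0.5 * log2(1 + x)."""
    return 0.5 * math.log2(1.0 + x)


def phi(t, x, G):
    """Perspective of C(G*x): t * C(G*x/t) for t > 0, and 0 for t = 0."""
    if t == 0:
        return 0.0
    return t * C(G * x / t)


def midpoint_gap(p, q, G):
    """phi at the midpoint of p and q minus the average of phi at p and q.

    p and q are (t, x) pairs; the gap is non-negative when phi is concave.
    """
    (t1, x1), (t2, x2) = p, q
    mid = phi(0.5 * (t1 + t2), 0.5 * (x1 + x2), G)
    return mid - 0.5 * (phi(t1, x1, G) + phi(t2, x2, G))
```

A passing check over many random pairs does not prove concavity, but it is a quick sanity test of an implementation of (29).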

After solving

(P3)  $\max_{\mathbf{X}} g(\mathbf{X})$

s.t. (22), (15)-(16), (24)-(27),

for its global optimum, the S corresponding to this global
optimum is the optimum solution for (P2). In the following
subsection, we will focus on solving the problem

(P4)  $\max_{\mathbf{X}} g(\mathbf{X})$

s.t. (22), (15), (16), (24), (25),

which is a relaxation of (P3) obtained by omitting (26) and (27).
Obviously, $F_X$ is a subset of the feasible set of (P4). Most
interestingly, (P4) is a convex program, which can be solved
by highly efficient convex-optimization techniques. Define the
optimum objective value for (P1) and (P4) as $f^*$ and $g^*$,
respectively. According to the relaxations we made,

$g^* \ge \max_{\mathbf{X} \in F_X} g(\mathbf{X}) = \max_{\mathbf{S} \in F_S} f(\mathbf{S}) \ge f^*$

follows. Define a global optimum for (P4) as $\mathbf{X}^*$. If we can
find an $\mathbf{X}^*$ that satisfies (26) and (27), and contains binary
indicator variables (i.e., $t_{klu}, t_{klab} \in \{0, 1\},\ \forall\, k,l,u,a,b$),
then it can readily be shown that $\mathbf{S}(\mathbf{X}^*)$ must be a global
optimum for (P1).



In practice, it may be difficult to find precisely a global
optimum $\mathbf{X}^*$ for (P4) in general. For instance, existing
convex-optimization techniques such as the interior-point method
or the dual method all search for the global optimum in
an iterative manner, and finally produce an approximately
optimum solution with an objective value very close to the
optimum value. Motivated by this fact, suppose a solution X'
which satisfies

1) (26) and (27), and all indicator variables in X' are binary;

2) $g^* - g(\mathbf{X}')$ is very small;

can be found for (P4), then S(X') is feasible for (P1) and
$f^* - f(\mathbf{S}(\mathbf{X}'))$ is also very small because

$f^* - f(\mathbf{S}(\mathbf{X}')) \le g^* - g(\mathbf{X}'),$

which means that S(X') can be taken as an approximately
optimum solution for (P1).
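The inequality used above can be spelled out in two steps. This is a sketch that relies only on relations already stated in this subsection: $g(\mathbf{X}) = f(\mathbf{S}(\mathbf{X}))$ for any $\mathbf{X} \in F_X$ (which holds for X' by condition 1) and $g^* \ge f^*$ from the relaxation chain.

```latex
\begin{align*}
f^* - f(\mathbf{S}(\mathbf{X}'))
  &= f^* - g(\mathbf{X}')
  && \text{since } g(\mathbf{X}') = f(\mathbf{S}(\mathbf{X}')) \\
  &\le g^* - g(\mathbf{X}')
  && \text{since } g^* \ge f^* .
\end{align*}
```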

In the following subsection, we use the dual method to solve
(P4). Specifically, the ellipsoid method is used to search for
the dual optimum. This ellipsoid method is reduced to the
bisection method to update upper and lower bounds for the
dual optimum iteratively until convergence. In some cases,
the global optimum for (P1) can be found, while in other
cases we explain by theoretical analysis and illustrate by
numerical experiments that the optimum solution for the
Lagrangian relaxation problem (LRP) of (P4) corresponding
to the upper bound produced after convergence can be taken
as the X' described above. Then, S(X') can be output as an
approximately optimum solution for (P1).

C. The development of the RA algorithm 

Since (P4) is a convex program and it satisfies the Slater
constraint qualification², (P4) has zero duality gap (see page
226 of [35]), which justifies the use of the dual method to solve
(P4). To this end, $\mu$ is introduced as a Lagrange multiplier for
the constraint (24). The LRP for (P4) is

(P5)  $\max_{\mathbf{X}} L(\mu, \mathbf{X}) = g(\mathbf{X}) + \mu\,\big(P_t - P(\mathbf{X})\big)$

s.t. (22), (15), (16), (25),

where $L(\mu, \mathbf{X})$ is the Lagrangian of (P4) and $P(\mathbf{X})$ is the
left-hand side of (24) (i.e., the sum power as a function of
X). A global optimum for (P5) is denoted by $\mathbf{X}_\mu$. The dual
function is defined as $d(\mu) = L(\mu, \mathbf{X}_\mu)$, which is a convex
function of $\mu$. In particular,

$\gamma(\mu) = P_t - P(\mathbf{X}_\mu)$  (30)

is a subgradient of $d(\mu)$, i.e., it satisfies

$\forall\, \mu',\ d(\mu') \ge d(\mu) + (\mu' - \mu)\,\gamma(\mu),$  (31)

and the dual problem is to find the dual optimum

$\mu^* = \arg\min_{\mu \ge 0} d(\mu).$  (32)

Since (P4) has zero duality gap, the following properties
hold:

• Note that $\mu^*$ represents the sensitivity³ of the optimum
objective value for (P4) with respect to $P_t$, i.e.,
$\frac{\partial g(\mathbf{X}^*)}{\partial P_t} = \mu^*$. Obviously, $g(\mathbf{X}^*)$ is strictly increasing in
$P_t$, meaning that $\mu^* > 0$.

• $\mu = \mu^*$ and $\mathbf{X}_\mu = \mathbf{X}^*$ are true if and only if $\mathbf{X}_\mu$
is feasible and $\mu\gamma(\mu) = 0$ is satisfied, according to
Proposition 5.1.5 in [36]. This means that $\mu^*\gamma(\mu^*) = 0$.
Moreover, $\mathbf{X}_\mu = \mathbf{X}^*$ if $\gamma(\mu) = 0$.

² There exists at least an X satisfying all inequality constraints strictly.

The idea behind the dual method to solve (P4) is to search
for $\mu^*$. Then, the $\mathbf{X}_{\mu^*}$ that satisfies $\gamma(\mu^*) = 0$ can be taken
as $\mathbf{X}^*$. The key to the dual method consists of two procedures
to find $\mathbf{X}_\mu$ for a given $\mu > 0$ and $\mu^*$, respectively, which are
developed as follows.

1) Finding $\mathbf{X}_\mu$ when $\mu > 0$: The following strategy is used
to find $\mathbf{X}_\mu$ for (P5) when $\mu > 0$. First, the optimum $\tilde{\mathbf{P}}$ for (P5)
with fixed I is found and denoted by $\tilde{\mathbf{P}}_{\mathbf{I}}$. Define $\mathbf{X}_{\mathbf{I}} = \{\mathbf{I}, \tilde{\mathbf{P}}_{\mathbf{I}}\}$.
Then we find the optimum I to maximize $L(\mu, \mathbf{X}_{\mathbf{I}})$ subject to
(22), (15) and (16). Finally, the $\mathbf{X}_{\mathbf{I}}$ corresponding to this optimum
I can be taken as $\mathbf{X}_\mu$.

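Suppose I is fixed; we find $\tilde{\mathbf{P}}_{\mathbf{I}}$ as follows. Specifically, every
$\tilde{P}_{klu}$ in $\tilde{\mathbf{P}}_{\mathbf{I}}$ is equal to 0 when $t_{klu} = 0$. When $t_{klu} > 0$, the
optimum $\tilde{P}_{klu}$ can be found by using the KKT conditions
related to $\tilde{P}_{klu}$. In summary, the optimum $\tilde{P}_{klu}$ can be shown
to be

$\tilde{P}_{klu} = t_{klu}\,\Lambda(w_u, \mu, G_{klu}),$  (33)

where $\Lambda(w_u, \mu, G)$ is defined as $\Lambda(w_u, \mu, G) = \left[\frac{w_u \log_2 e}{2\mu} - \frac{1}{G}\right]^+$.
In a similar way, the optimum $\tilde{\alpha}_{klab}$
and $\tilde{\beta}_{klab}$ can be shown to be

$\tilde{\alpha}_{klab} = t_{klab}\,\Lambda(w_a, \mu, G_{sa,k}),$  (34)
$\tilde{\beta}_{klab} = t_{klab}\,\Lambda(w_b, \mu, G_{sb,l}),$  (35)

respectively. Using these formulas, $\mathbf{X}_{\mathbf{I}} = \{\mathbf{I}, \tilde{\mathbf{P}}_{\mathbf{I}}\}$ can be
found. It can readily be shown that

$L(\mu, \mathbf{X}_{\mathbf{I}}) = \mu P_t + \sum_{k,l,u,a,b} \big(t_{klu}A_{klu} + t_{klab}B_{klab}\big)$  (36)

where

$A_{klu} = w_u C\big(G_{klu}\Lambda(w_u, \mu, G_{klu})\big) - \mu\,\Lambda(w_u, \mu, G_{klu}),$

$B_{klab} = w_a C\big(G_{sa,k}\Lambda(w_a, \mu, G_{sa,k})\big) - \mu\,\Lambda(w_a, \mu, G_{sa,k}) + w_b C\big(G_{sb,l}\Lambda(w_b, \mu, G_{sb,l})\big) - \mu\,\Lambda(w_b, \mu, G_{sb,l}).$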
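The closed forms (33)-(36) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes $C(x) = \frac{1}{2}\log_2(1+x)$ (defined outside this section), and the names `water_level`, `metric_A` and `metric_B` are hypothetical.

```python
import math

LOG2E = math.log2(math.e)


def C(x):
    """Rate function; assumed here to be 0.5 * log2(1 + x)."""
    return 0.5 * math.log2(1.0 + x)


def water_level(w, mu, G):
    """Lambda(w, mu, G) = [w*log2(e)/(2*mu) - 1/G]^+ for mu > 0, G > 0."""
    return max(0.0, w * LOG2E / (2.0 * mu) - 1.0 / G)


def metric_A(w_u, mu, G_klu):
    """A_klu: weighted rate minus priced power at the optimum power."""
    lam = water_level(w_u, mu, G_klu)
    return w_u * C(G_klu * lam) - mu * lam


def metric_B(w_a, w_b, mu, G_sak, G_sbl):
    """B_klab: sum of the two direct-transmission terms for users a and b."""
    return metric_A(w_a, mu, G_sak) + metric_A(w_b, mu, G_sbl)
```

Under the assumed rate function, `water_level` is the stationary point of $w\,C(Gx) - \mu x$ over $x \ge 0$, which is why each metric equals the maximum of that expression.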

Finally, we find the optimum I for maximizing $L(\mu, \mathbf{X}_{\mathbf{I}})$
subject to (22), (15) and (16). This problem is equivalent to

$\max_{\mathbf{I},\ \{t_{kl} \mid \forall k,l\}} \ \sum_{k,l} \sum_{u,a,b} \big(t_{klu}A_{klu} + t_{klab}B_{klab}\big)$

s.t. $\sum_{l} t_{kl} = 1,\ \forall\, k,$  (37)

$\sum_{k} t_{kl} = 1,\ \forall\, l,$

$t_{kl} = \sum_{u} t_{klu} + \sum_{a,b} t_{klab},\ \forall\, k,l,$

$t_{klu} \ge 0,\ t_{klab} \ge 0,\ \forall\, k,l,u,a,b.$

Note that the inequality $\sum_{u,a,b} \big(t_{klu}A_{klu} + t_{klab}B_{klab}\big) \le
t_{kl}C_{kl}$ holds where $C_{kl} = \max\{\max_u A_{klu}, \max_{a,b} B_{klab}\}$.
Let us call $A_{klu}$ the metric for $t_{klu}$ and $B_{klab}$ the
metric for $t_{klab}$. This inequality is tightened when all entries
of $\{t_{klu}, t_{klab} \mid \forall\, u,a,b\}$ are assigned to zero, except that the
one with the metric equal to $C_{kl}$ is assigned to $t_{kl}$.

³ Note that the sensitivity analysis was introduced in pages 249-253 of [35]
for a convex minimization problem. It can be proven that $\frac{\partial g(\mathbf{X}^*)}{\partial P_t} = \mu^*$ by
casting the problem (P4) into an equivalent convex minimization problem.
The proof is straightforward and omitted here due to space limitation.

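Therefore, after the problem

$\max_{\{t_{kl} \mid \forall k,l\}} \ \sum_{k,l} t_{kl}C_{kl}$

s.t. $\sum_{l} t_{kl} = 1,\ \forall\, k,$  (38)

$\sum_{k} t_{kl} = 1,\ \forall\, l,$

$t_{kl} \ge 0,\ \forall\, k,l,$

is solved for its optimum solution $\{t^*_{kl} \mid \forall\, k,l\}$, an optimum I
for (37) can be constructed by assigning, for every combination
of $k$ and $l$, all entries in $\{t_{klu}, t_{klab} \mid \forall\, u,a,b\} \subset \mathbf{I}$ to zero,
except for the one with the metric equal to $C_{kl}$, which is assigned $t^*_{kl}$.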
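The construction described above, taking the largest metric for each pair $(k, l)$ and giving the whole of $t_{kl}$ to the corresponding entry, can be sketched as follows. The dictionary layout and the names `best_metric` and `weighted_sum` are hypothetical choices made for illustration.

```python
def best_metric(A_kl, B_kl):
    """Return (C_kl, winner) for one subcarrier pair (k, l).

    A_kl maps a user u to A_klu; B_kl maps a user pair (a, b) to B_klab.
    winner is ("relay", u) or ("direct", (a, b)), whichever has the
    largest metric.
    """
    candidates = [(value, ("relay", u)) for u, value in A_kl.items()]
    candidates += [(value, ("direct", ab)) for ab, value in B_kl.items()]
    return max(candidates, key=lambda c: c[0])


def weighted_sum(t_u, t_ab, A_kl, B_kl):
    """Sum of t_klu*A_klu + t_klab*B_klab over all entries of the pair."""
    total = sum(t_u[u] * A_kl[u] for u in t_u)
    total += sum(t_ab[ab] * B_kl[ab] for ab in t_ab)
    return total
```

Any non-negative split of $t_{kl}$ across the entries gives a weighted sum no larger than $t_{kl}C_{kl}$, and the bound is met when the winning entry receives all of $t_{kl}$.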

Most interestingly, (38) is a standard assignment problem,
hence every entry in $\{t^*_{kl} \mid \forall\, k,l\}$ is either 0 or 1
and $\{t^*_{kl} \mid \forall\, k,l\}$ can be found efficiently by the Hungarian
algorithm [37]. After knowing $\{t^*_{kl} \mid \forall\, k,l\}$, the optimum I can
be constructed according to the way mentioned earlier. Finally,
the corresponding $\mathbf{X}_{\mathbf{I}} = \{\mathbf{I}, \tilde{\mathbf{P}}_{\mathbf{I}}\}$ is assigned to $\mathbf{X}_\mu$. Note
that to compute $\mathbf{X}_\mu$, $\{A_{klu}, B_{klab} \mid \forall\, k,l,u,a,b\}$ containing
$K^2(U + U^2)$ entries has to be computed first, which implies a
complexity of $O(K^2U^2)$. Moreover, the Hungarian algorithm
to solve (38) has a complexity of $O(K^3)$ [37]. This means
that the complexity of finding $\mathbf{X}_\mu$ is $O(K^2U^2 + K^3)$.
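For readers who want to experiment with the assignment step, the sketch below is a compact potential-based Hungarian routine of cubic complexity, wrapped for maximization. It is a generic textbook variant given under the assumption of a square $K \times K$ metric matrix, not the specific implementation of [37], and the function names are hypothetical.

```python
def solve_assignment_min(cost):
    """Hungarian algorithm (potential-based, O(n^3)) for a square matrix.

    Returns a list `match` with match[row] = column, minimizing total cost.
    """
    n = len(cost)
    INF = float("inf")
    u = [0.0] * (n + 1)   # row potentials (1-based)
    v = [0.0] * (n + 1)   # column potentials (1-based)
    p = [0] * (n + 1)     # p[j] = row currently matched to column j
    way = [0] * (n + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path back to column 0.
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    match = [0] * n
    for j in range(1, n + 1):
        match[p[j] - 1] = j - 1
    return match


def solve_assignment_max(C):
    """Maximize sum_k C[k][match[k]] by negating the metrics."""
    return solve_assignment_min([[-c for c in row] for row in C])
```

Here `solve_assignment_max(C)[k]` gives the second-slot subcarrier $l$ with $t^*_{kl} = 1$ for first-slot subcarrier $k$, when `C[k][l]` holds $C_{kl}$.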

2) Finding $\mu^*$: To find $\mu^*$, an incremental-update based
subgradient method which updates $\mu$ with $\mu = [\mu - \delta(P_t -
P(\mathbf{X}_\mu))]^+$ can be used, where $\delta > 0$ is a prescribed step
size [36]. However, this method converges very slowly, since
$\delta$ has to be very small to guarantee convergence. To speed
up the search for $\mu^*$, we use the ellipsoid method. The idea
behind the ellipsoid method is to find a series of contracting
ellipsoids that always contain $\mu^*$ [35]. The ellipsoid method
can be reduced to the bisection method as follows.

First, a lower bound $\mu_l$ and an upper bound $\mu_u$ for $\mu^*$
are initialized. As said earlier, $\mu^* > 0$ holds, thus $\mu_l$ can
be initialized with 0. As shown in the Appendix, $\mu_u$ can
be initialized with $\frac{K w_{\max} \log_2 e}{P_t}$. Then, $\mu_l$ and $\mu_u$ are
updated iteratively as follows. In every iteration, $\mathbf{X}_{\mu_m}$ where
$\mu_m = \frac{\mu_l + \mu_u}{2}$ is computed. If $\gamma(\mu_m) > 0$, then $\forall\, \mu > \mu_m$,
$d(\mu) \ge d(\mu_m) + (\mu - \mu_m)\gamma(\mu_m) > d(\mu_m)$. This means that
$\mu^*$ must be confined in $[\mu_l, \mu_m]$, so $\mu_u$ should be updated with
$\mu_m$. If $\gamma(\mu_m) < 0$, it can be shown similarly that $\mu_l$ should be
updated with $\mu_m$. The iteration is terminated when $\gamma(\mu_m) = 0$
or $\mu_u - \mu_l < \epsilon$ where $\epsilon > 0$ is a prescribed small value.
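The bisection update can be tried on a toy problem. The sketch below assumes a fixed assignment with $n$ independent direct links, so that the allocated sum power is $\sum_i \Lambda(w_i, \mu, G_i)$; the initial upper bound is adapted to this toy case as $n\,w_{\max}\log_2 e/(2P_t)$. The function names and the toy setting are assumptions made for illustration only.

```python
import math

LOG2E = math.log2(math.e)


def allocated_power(mu, weights, gains):
    """Sum power used at dual price mu (water-filling on every link)."""
    return sum(max(0.0, w * LOG2E / (2.0 * mu) - 1.0 / G)
               for w, G in zip(weights, gains))


def bisect_dual(P_t, weights, gains, eps=1e-6):
    """Bisection on mu: keep mu_l <= mu* <= mu_u and halve the interval."""
    mu_l = 0.0
    mu_u = len(weights) * max(weights) * LOG2E / (2.0 * P_t)
    while mu_u - mu_l > eps:
        mu_m = 0.5 * (mu_l + mu_u)
        gamma = P_t - allocated_power(mu_m, weights, gains)
        if gamma == 0:
            return mu_m          # exact dual optimum found
        if gamma > 0:
            mu_u = mu_m          # power budget not exhausted: mu* <= mu_m
        else:
            mu_l = mu_m          # budget exceeded: mu* >= mu_m
    return mu_u                  # the upper bound keeps the solution feasible
```

Returning the upper bound mirrors the choice made in the text: at $\mu_u$ the subgradient is non-negative, so the sum-power constraint is respected.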

When the iteration is terminated with $\gamma(\mu_m) = 0$ being
satisfied, $\mathbf{X}^* = \mathbf{X}_{\mu_m}$ must hold as said earlier. Note that
$\mathbf{S}(\mathbf{X}_{\mu_m})$ must be a global optimum for (P1) since $\mathbf{X}_{\mu_m}$
satisfies (26) and (27), and contains binary indicator variables
as said in Section IV.B.

We now consider the case where the iteration is terminated
with $\mu_u - \mu_l < \epsilon$ being satisfied. In such a case, we find
that $\mathbf{X}_{\mu_u}$ is an approximately optimum solution for (P4). This
finding will be illustrated by numerical experiments in Section
V. It can be explained by theoretical analysis as follows. Note
that

$g^* - g(\mathbf{X}_{\mu_u}) \le d(\mu_u) - g(\mathbf{X}_{\mu_u}) = \mu_u\gamma(\mu_u)$  (39)

holds since $\forall\, \mu \ge 0$, $g^* \le d(\mu)$. In addition, we present the
following lemma:

Lemma 1: $\gamma(\mu)$ is an increasing (i.e., non-decreasing) function of $\mu > 0$.
Proof: Suppose $\mu_1 > \mu_2$. According to (31),

$d(\mu_1) \ge d(\mu_2) + (\mu_1 - \mu_2)\gamma(\mu_2)$
$d(\mu_2) \ge d(\mu_1) + (\mu_2 - \mu_1)\gamma(\mu_1)$

follow. As a result,

$(\mu_1 - \mu_2)\gamma(\mu_1) \ge d(\mu_1) - d(\mu_2) \ge (\mu_1 - \mu_2)\gamma(\mu_2)$

holds, and thus $\gamma(\mu_1) \ge \gamma(\mu_2)$. This completes the proof. ■

According to Lemma 1, $\gamma(\mu_u) \ge \gamma(\mu^*) = 0$ because
$\mu_u \ge \mu^*$, meaning that $\mathbf{X}_{\mu_u}$ is always feasible for (P4).
Moreover, $\mu_u\gamma(\mu_u)$ reduces as the iteration proceeds and it is
very small after convergence, since $\mu_u$ decreases to approach
$\mu^*$, which satisfies $\mu^*\gamma(\mu^*) = 0$. This means that $g^* - g(\mathbf{X}_{\mu_u})$
is very small according to (39). Moreover, $\mathbf{X}_{\mu_u}$ also satisfies
(26) and (27), and all indicator variables in $\mathbf{X}_{\mu_u}$ are binary.
This means that $\mathbf{S}(\mathbf{X}_{\mu_u})$ can be output as an approximately
optimum solution for (P1) as said in Section IV.B.

The overall procedure to find an approximately optimum
solution for (P1) is summarized in Algorithm 1. Its complexity
can be studied as follows. First, $\{G_{klu} \mid \forall\, k,l,u\}$ needs to be
computed, which needs $K^2U$ operations. Then, finding $\mu^*$
with the bisection method requires at most a number of
iterations in the order of $\log_2(K)$. For each iteration, computing
$\mathbf{X}_\mu$ has a complexity of $O(K^2U^2 + K^3)$. Therefore, the total
complexity of Algorithm 1 is $O(\log_2(K)(K^2U^2 + K^3))$.

V. Numerical experiments 

In numerical experiments, we consider the relay-aided
downlink OFDMA system illustrated in Figure 5. The relay
is located in the line between the source and the center of
the user region, and the source-to-relay distance is $d$ km.
$U = 5$ users are served and they are randomly and uniformly
distributed in a circular region of radius 50 m. Their weights
are randomly chosen between 0.8 and 1.2 for every system
realization simulated. For Algorithm 1, $\epsilon$ is set as $10^{-6}$, which



Algorithm 1 The RA algorithm to find an approximately
optimum S for (P1)

compute $G_{klu}$, $\forall\, k,l,u$.
$\mu_l = 0$; $\mu_u = \frac{K w_{\max} \log_2 e}{P_t}$;
while $\mu_u - \mu_l > \epsilon$ do
    $\mu_m = \frac{\mu_l + \mu_u}{2}$;
    solve (P5) with $\mu = \mu_m$ for $\mathbf{X}_{\mu_m}$; compute $\gamma(\mu_m)$;
    if $\gamma(\mu_m) = 0$ then
        compute $\mathbf{S}(\mathbf{X}_{\mu_m})$ and output it as an optimum solution for (P1);
        exit the algorithm;
    else if $\gamma(\mu_m) > 0$ then
        $\mu_u = \mu_m$;
    else
        $\mu_l = \mu_m$;
    end if
end while
solve (P5) with $\mu = \mu_u$ for $\mathbf{X}_{\mu_u}$;
compute $\mathbf{S}(\mathbf{X}_{\mu_u})$ and output it as an approximately optimum solution for (P1).



Fig. 5. The relay-aided downlink OFDMA system considered in numerical
experiments (distance labels in the figure: d km and 1 km).



leads to at most $\log_2\!\left(\frac{K w_{\max} \log_2 e}{P_t\,\epsilon}\right) \approx 21 + \log_2\!\left(\frac{K}{P_t}\right)$ iterations
for a given combination of $K$ and $P_t$.
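The iteration counts quoted in this section can be reproduced from the bound above. The sketch below assumes that $P_t$ is expressed relative to $\sigma^2$ in linear scale and takes $w_{\max} = 1.2$ from the stated weight range; both function names are hypothetical.

```python
import math


def max_iterations_exact(K, P_t, w_max, eps):
    """ceil(log2(K * w_max * log2(e) / (P_t * eps))): halvings needed to
    shrink the initial interval [0, K*w_max*log2(e)/P_t] below eps."""
    return math.ceil(math.log2(K * w_max * math.log2(math.e) / (P_t * eps)))


def max_iterations_approx(K, P_t):
    """Rounded form used in the text: ceil(21 + log2(K / P_t))."""
    return math.ceil(21 + math.log2(K / P_t))


# K = 128 at 0 dB, K = 32 at 20 dB, and K = 32 at 45 dB.
print(max_iterations_approx(128, 1.0),
      max_iterations_approx(32, 100.0),
      max_iterations_approx(32, 10 ** 4.5))
```

With these inputs the rounded form gives 28, 20 and 12 iterations, matching the figures stated for the three experiments; the exact form is never larger for $w_{\max} \le 1.2$ and $\epsilon = 10^{-6}$.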

The channels are independent of each other and generated 
in the same way as in (TJ, (3). For every user u, the impulse 
response of the source-to-u channel is modeled as a delay 
line with L = 6 taps, which are independently generated from 
circularly symmetric complex Gaussian distributions with zero 
mean and variance equal to -^(^r) , where d rc f = 1 
km and d su represents the source-to-u distance. The source- 
to-relay and relay-to-u channels are generated in the same 
way, with each tap having the variance as ^(g^-) and 
x(cP7) 2 ' 5 , respectively, where d ru represents the relay- 
to-w distance. The CSI {/i S r,fc|V fc}, {h su ^\V k,u} and 
{h TU ,k\y k,u} are computed by making A' -point FFT over 
the impulse response of the associated channels. 
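The channel generation described above can be sketched as follows: draw independent circularly symmetric complex Gaussian taps, then take a $K$-point DFT of the zero-padded impulse response. The per-tap `variance` is left as a plain argument, since the path-loss expression is not reproduced here, and the function names are hypothetical. A direct $O(KL)$ DFT is used to keep the example dependency-free.

```python
import cmath
import math
import random


def draw_taps(L, variance, rng):
    """L i.i.d. circularly symmetric complex Gaussian taps.

    Each tap has zero mean and the given variance, split equally between
    the real and imaginary parts.
    """
    s = math.sqrt(variance / 2.0)
    return [complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(L)]


def subcarrier_response(taps, K):
    """K-point DFT of the impulse response (implicitly zero-padded)."""
    return [sum(h * cmath.exp(-2j * math.pi * k * n / K)
                for n, h in enumerate(taps))
            for k in range(K)]


# Example: one channel realization with L = 6 taps and K = 32 subcarriers.
rng = random.Random(0)
H = subcarrier_response(draw_taps(6, 1.0 / 6.0, rng), 32)
```

The value `1.0 / 6.0` in the example is only a placeholder variance.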

In order to illustrate the benefit of optimized subcarrier
pairing and opportunistic DF relaying, we also consider another
benchmark protocol (BP-2) in addition to the already studied
benchmark protocol (BP-1). BP-2 is the one studied in
[25] using a single relay, i.e., subcarrier $k$ in the first slot
and subcarrier $k$ in the second slot are allocated to a user for
either the relay-aided transmission or the direct transmission
separately. The RA algorithm proposed in [25] is used for
BP-2.

According to the analysis in Section IV.C, $\mathbf{S}(\mathbf{X}_{\mu_u})$ is finally
output as an approximately optimum solution if the iteration
is terminated with $\mu_u - \mu_l < \epsilon$ being satisfied. In such a case,
$f^* - f(\mathbf{S}(\mathbf{X}_{\mu_u})) \le \mu_u\gamma(\mu_u)$ after convergence, and

$\delta(\mu_u) = \frac{\mu_u\gamma(\mu_u)}{f(\mathbf{S}(\mathbf{X}_{\mu_u}))}$  (40)

can be computed to evaluate the relative difference between
the WSR finally achieved and the maximum WSR for (P1).

To illustrate the effectiveness of Algorithm 1, we have
executed Algorithm 1 for both the proposed protocol and
BP-1 over $10^4$ random system realizations. Specifically, the
system realizations are generated by randomly choosing a
combination of $d \in [0.1, 0.9]$ km, $K \in \{8, 16, 32, 64, 128\}$,
$P_t/\sigma^2 \in [0, 45]$ dB, then generating the channels as said
earlier. It can readily be shown that at most 28 iterations are
executed for Algorithm 1 for every random channel realization
generated. The $\delta(\mu_u)$ is evaluated and collected for all system
realizations when the iteration of Algorithm 1 terminates with
$\mu_u - \mu_l < \epsilon$ being satisfied. The probability density function
(PDF) of these $\delta(\mu_u)$ in dB scale (i.e., $10\log_{10}(\delta(\mu_u))$) is
shown in Figure 6. It can be seen that $\delta(\mu_u)$ is always smaller
than 3%, which indicates that the finally produced $\mathbf{S}(\mathbf{X}_{\mu_u})$
is indeed an approximately optimum solution with a WSR
very close to the maximum WSR for (P1) if the iteration is
terminated with $\mu_u - \mu_l < \epsilon$ being satisfied.

Fig. 6. The PDF of $10\log_{10}(\delta(\mu_u))$ simulated over $10^4$ random system
realizations (horizontal axis: $10\log_{10}(\delta(\mu_u))$).

To show the impact of relay position on the protocols'
performance, we choose $P_t/\sigma^2 = 20$ dB and $K = 32$,
then evaluate the average optimum WSRs and $N_{sp}$ for
every protocol over 1000 random channel realizations when $d$
increases from 0.1 to 0.9 km. Here, $N_{sp}$ denotes the average
number of the subcarrier pairs that should be used in the
relay-aided mode to maximize the WSR. It can readily be computed
that at most 20 iterations are executed for Algorithm 1 for every
channel realization generated. The results are shown in Figure 7.

When $d$ is fixed, the proposed protocol leads to a greater
average optimum WSR than BP-1, which illustrates the
theoretical analysis in Section III-B. Moreover, the proposed
protocol and BP-1 both have greater average optimum WSRs
than BP-2. This is because they can better exploit the degrees
of freedom for subcarrier pairing and assignment to users than
BP-2 to improve the spectrum efficiency.




Fig. 7. The average optimum WSRs and $N_{sp}$ as the relay position changes
when $P_t/\sigma^2 = 20$ dB and $K = 32$.



It is interesting to observe that for every protocol, the
optimum WSR is higher and it is more likely to pair subcarriers
for the relay-aided transmission to maximize the WSR when
the relay moves toward the middle between the source and
the user-region center. This behavior is interpreted for the
proposed protocol as follows (those for BP-1 and BP-2 can
be interpreted in a similar way and are thus omitted due to space
limitation). It is important to note that the optimum WSR for
the proposed protocol, as the optimum objective value of (P1),
depends on $\{G_{su,k}, G_{klu} \mid \forall\, k,l,u\}$. If $\forall\, k,l,u$, $G_{klu}$ is more
likely to take a high value, the subcarriers are more likely
to be paired for the relay-aided transmission to maximize
the WSR, and the average optimum WSR for the proposed
protocol increases. As can be seen from Fig. 3, $G_{klu}$ is high
if both $G_{sr,k}$ and $G_{ru,l}$ are much greater than $G_{su,k}$. When
the relay lies in the middle between the source and the
user-region center, both $G_{sr,k}$ and $G_{ru,l}$ are likely to be much greater
than $G_{su,k}$, meaning that $G_{klu}$ is likely to be high. Therefore,
the optimum WSR is higher and it is more likely to pair
subcarriers for the relay-aided transmission when the relay lies
in the middle between the source and the user-region center.

When d is small, the optimum WSR for the proposed 
protocol is much greater than that for BP-1, and it is more 
likely to pair subcarriers for the relay-aided transmission to 
maximize the WSR for the proposed protocol than for BP-1. 
This can be explained as follows. Note that if G'^ lu — G\ lu 
is very likely to be high $\forall\, k,l,u$, the proposed protocol
is more likely to pair the subcarriers for the relay-aided 
transmission than BP-1. According to the analysis in Section 



kl (. 



increases when G sv k increases or G T 



reduces. When d is small, G sr) fe and G vu j are very likely to 
be high and small, respectively, meaning that G^. lu — G\ lu is 
very likely to take a high value. This explains the observation.

Fig. 8. The average optimum WSRs and $N_{sp}$ as the relay position changes
when $P_t/\sigma^2 = 45$ dB and $K = 32$.

We also evaluated the average optimum WSRs and $N_{sp}$ for
every protocol over 1000 random channel realizations when
$P_t/\sigma^2 = 45$ dB and $K = 32$. It can readily be computed that
at most 12 iterations are executed for Algorithm 1 for every
channel realization generated. The results are shown in Figure
8. It can be seen that regardless of the relay position, almost
all subcarriers are used for the direct transmission to maximize
the WSR for every protocol; therefore, all protocols have
similar average optimum WSRs. This can be interpreted as
follows. Note that when the subcarrier pairing and assignment
to users are fixed for every protocol, the optimum sum power
for the subcarrier pairs and the optimum power for unpaired
subcarriers can be found by the water-filling method. Since
$P_t/\sigma^2$ is very high, the optimum sum power allocated to
subcarriers $k$ and $l$ is very likely to be high if they are paired
for the relay-aided transmission to a user. In such a case,
it can readily be shown that splitting this high sum power
to the two subcarriers for separate direct transmission to the
same user can result in a higher WSR. This explains why
almost all subcarriers are used for the direct transmission to
maximize the WSR when $P_t/\sigma^2$ is very high. It also indicates
that the proposed protocol leads to a better optimum WSR
performance than the benchmark ones, especially in the
low-power regime.

VI. Conclusion 

In this paper, we have addressed the WSR maximization
problem for the DF relay-aided downlink OFDMA transmission
under a total power constraint. A novel subcarrier-pair
based opportunistic DF relaying protocol has been proposed.
A benchmark protocol has also been considered. An algorithm
has been designed to find at least an approximately optimum
RA with a WSR very close to the maximum WSR. Numerical
experiments have illustrated the effectiveness of the RA
algorithm and the impact of relay position and total power
on the protocols' performance. Theoretical analysis has been
presented to interpret what was observed in numerical
experiments.

Appendix
An upper bound for $\mu^*$

An initial upper bound for $\mu^*$ can be found as follows.
According to Proposition 5.5.1 in [36], $\mathbf{X}^*$ must satisfy $\mathbf{X}_{\mu^*} =
\mathbf{X}^*$ and $\mu^*(P_t - P(\mathbf{X}^*)) = 0$. Since $\mu^* > 0$, $P(\mathbf{X}^*) =
P_t$ must be satisfied. According to the derivation to find $\mathbf{X}_\mu$,
the power and indicator variables in $\mathbf{X}^*$ must satisfy (33)-(35).
It can readily be seen that the $\tilde{P}^*_{klu}$, $\tilde{\alpha}^*_{klab}$ and $\tilde{\beta}^*_{klab}$
in $\mathbf{X}^*$ are smaller than $t^*_{klu}\frac{w_u \log_2 e}{2\mu^*}$, $t^*_{klab}\frac{w_a \log_2 e}{2\mu^*}$ and
$t^*_{klab}\frac{w_b \log_2 e}{2\mu^*}$, respectively, where $t^*_{klu}$ and $t^*_{klab}$ represent
the value of $t_{klu}$ and $t_{klab}$ in $\mathbf{X}^*$. Therefore, the inequality

$P_t = P(\mathbf{X}^*) \le \sum_{k,l,u,a,b} \big(t^*_{klu} + 2t^*_{klab}\big)\frac{w_{\max} \log_2 e}{2\mu^*} \le \frac{K w_{\max} \log_2 e}{\mu^*}$  (41)

follows where $w_{\max} = \max_u\{w_u\}$, meaning that $\mu^* \le \frac{K w_{\max} \log_2 e}{P_t}$
must be satisfied. Therefore, $\frac{K w_{\max} \log_2 e}{P_t}$ can be
used as an initial upper bound for $\mu^*$.
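The last step of the bound can be made explicit. The following is a sketch assuming the assignment constraints take the form shown in (37), i.e., $t_{kl} = \sum_u t_{klu} + \sum_{a,b} t_{klab}$ and $\sum_l t_{kl} = 1$ for each of the $K$ first-slot subcarriers.

```latex
\begin{align*}
\sum_{k,l}\Big(\sum_{u} t^*_{klu} + 2\sum_{a,b} t^*_{klab}\Big)
  &\le 2\sum_{k,l}\Big(\sum_{u} t^*_{klu} + \sum_{a,b} t^*_{klab}\Big)
   = 2\sum_{k}\sum_{l} t^*_{kl} = 2K, \\
P_t &\le 2K \cdot \frac{w_{\max}\log_2 e}{2\mu^*}
  \;\Longrightarrow\;
  \mu^* \le \frac{K\, w_{\max}\log_2 e}{P_t}.
\end{align*}
```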